Clustering with Confidence: A Low-Dimensional Binning Approach

Author

  • Rebecca Nugent
Abstract

We present a plug-in method for estimating the cluster tree of a density. The method takes advantage of the ability to exactly compute the level sets of a piecewise constant density estimate. We then introduce clustering with confidence, an automatic pruning procedure that assesses significance of splits (and so clusters) in the cluster tree; the only user input required is the desired confidence level.
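The level-set idea in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it assumes a 2-D equal-width histogram as the piecewise constant density estimate, treats bins as connected when they share an edge (4-neighbour adjacency), and uses hypothetical function names. It only counts connected components of each level set; the confidence-based pruning step is not shown.

```python
from collections import Counter, deque


def histogram_density(points, lo, hi, bins):
    """Piecewise constant density on a bins x bins grid over [lo, hi)^2 (assumed estimator)."""
    width = (hi - lo) / bins
    counts = Counter()
    for x, y in points:
        i = min(int((x - lo) / width), bins - 1)
        j = min(int((y - lo) / width), bins - 1)
        counts[(i, j)] += 1
    n = len(points)
    # Each occupied bin gets height count / (n * bin area); empty bins are omitted.
    return {cell: c / (n * width * width) for cell, c in counts.items()}


def level_set_components(density, level):
    """Connected components of the bins whose density is >= level (4-neighbour adjacency, assumed)."""
    active = {cell for cell, d in density.items() if d >= level}
    components = []
    while active:
        start = active.pop()
        comp, queue = {start}, deque([start])
        while queue:
            i, j = queue.popleft()
            for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if nb in active:
                    active.remove(nb)
                    comp.add(nb)
                    queue.append(nb)
        components.append(comp)
    return components


def component_counts(density):
    """Number of level-set components at each distinct density height, lowest level first."""
    levels = sorted(set(density.values()))
    return [(lvl, len(level_set_components(density, lvl))) for lvl in levels]


# Toy data (made up for illustration): two dense bins joined by one sparse bin.
clump_a = [(0.1, 0.1), (0.2, 0.5), (0.5, 0.2), (0.8, 0.8)]
bridge = [(1.5, 0.5)]
clump_b = [(2.1, 0.1), (2.2, 0.5), (2.5, 0.2), (2.8, 0.8)]

density = histogram_density(clump_a + bridge + clump_b, lo=0.0, hi=4.0, bins=4)
splits = [count for _, count in component_counts(density)]
print(splits)
```

Because the estimate takes only finitely many heights, the level sets change only at those heights, so checking each distinct bin height is enough to find every split. In the toy data the single low-level component separates into two once the level rises above the bridge bin, which corresponds to one split in the cluster tree.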


Similar resources

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...


‘Multi View Graphing’: Synchronous Linked Multi Visualization utilising Brushing, Binning and Clustering


MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample

MOTIVATION: Metagenomic binning remains an important topic in metagenomic analysis. Existing unsupervised binning methods for next-generation sequencing (NGS) reads do not perform well on (i) samples with low-abundance species or (ii) samples (even with high abundance) when there are many extremely low-abundance species. These two problems are common for real metagenomic datasets. Binning method...


Entity Matching on Web Tables: a Table Embeddings approach for Blocking

Entity matching, or record linkage, is the task of identifying records that refer to the same entity. Naive entity matching techniques (i.e., brute-force pairwise comparisons) have quadratic complexity. A typical shortcut to the problem is to employ blocking techniques to reduce the number of comparisons, i.e. to partition the data in several blocks and only compare records within the same bloc...


Clustering-Based Production-Line Binning of ICs Based on IDDQ

A clustering-based technique is proposed for production line testing and real time binning of ICs. This paper presents a two-phase approach. The first phase involves off-line clustering and cluster characterization based on prior data. In the second phase, each device is sorted based on its quality attributes, into bins associated with specific quality and cost parameters. This allows fast real...




Publication date: 2010